customProDB: an R package to generate customized protein databases from RNA-Seq data for proteomics search

Authors

  • Xiaojing Wang
  • Bing Zhang
Abstract

Database search is the most widely used approach for peptide and protein identification in mass spectrometry-based proteomics studies. Our previous study showed that sample-specific protein databases derived from RNA-Seq data can better approximate the real protein pools in the samples and thus improve protein identification. More importantly, single nucleotide variations, short insertions and deletions, and novel junctions identified from RNA-Seq data make the protein database more complete and sample-specific. Here, we report customProDB, an R package that enables the easy generation of customized databases from RNA-Seq data for proteomics search. This work bridges genomics and proteomics studies and facilitates cross-omics data integration. Availability and implementation: customProDB and related documents are freely available at http://bioconductor.org/packages/2.13/bioc/html/customProDB.html.
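To make the idea of a variant-aware protein database concrete, here is a minimal sketch in Python. It is not customProDB's API (the package is written in R). The function names `translate`, `apply_snv`, and `variant_fasta`, the header format, and the toy coding sequence are all hypothetical. The sketch assumes a forward-strand coding sequence, 0-based variant positions, and only single nucleotide variations. It applies each variation to the coding sequence, translates the result, and writes a FASTA entry only when the protein sequence changes.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snv(cds, pos, ref, alt):
    """Return the coding sequence with one base substituted (pos is 0-based)."""
    if cds[pos] != ref:
        raise ValueError(f"reference base mismatch at position {pos}")
    return cds[:pos] + alt + cds[pos + 1:]


def variant_fasta(name, cds, snvs):
    """Build FASTA text: the reference protein plus one entry per
    variation that changes the protein sequence (hypothetical header format)."""
    reference = translate(cds)
    entries = [(name, reference)]
    for pos, ref, alt in snvs:
        mutated = translate(apply_snv(cds, pos, ref, alt))
        if mutated != reference:  # skip synonymous variations
            entries.append((f"{name}_c{pos + 1}{ref}>{alt}", mutated))
    return "\n".join(f">{header}\n{seq}" for header, seq in entries)
```

For example, calling `variant_fasta("toy", "ATGGCCAAATGA", [(4, "C", "T"), (5, "C", "T")])` keeps the first variation, which changes alanine to valine, and drops the second, which is synonymous. A real pipeline would also need to handle strand, splicing, insertions and deletions, and novel junctions, which this sketch deliberately omits.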


Related articles

Human Proteomic Variation Revealed by Combining RNA-Seq Proteogenomics and Global Post-Translational Modification (G-PTM) Search Strategy

Mass-spectrometry-based proteomic analysis underestimates proteomic variation due to the absence of variant peptides and posttranslational modifications (PTMs) from standard protein databases. Each individual carries thousands of missense mutations that lead to single amino acid variants, but these are missed because they are absent from generic proteomic search databases. Myriad types of prote...


ROTS: An R package for reproducibility-optimized statistical testing

Differential expression analysis is one of the most common types of analyses performed on various biological data (e.g. RNA-seq or mass spectrometry proteomics). It is the process that detects features, such as genes or proteins, showing statistically significant differences between the sample groups under comparison. A major challenge in the analysis is the choice of an appropriate test statis...


MSProGene: integrative proteogenomics beyond six-frames and single nucleotide polymorphisms

Ongoing advances in high-throughput technologies have facilitated accurate proteomic measurements and provide a wealth of information on genomic and transcript level. In proteogenomics, these multi-omics data are combined to analyze unannotated organisms and to allow more accurate sample-specific predictions. Existing analysis methods still mainly depend on six-frame translations or re...


Galaxy Integrated Omics: Web-based Standards-Compliant Workflows for Proteomics Informed by Transcriptomics

With the recent advent of RNA-seq technology the proteomics community has begun to generate sample-specific protein databases for peptide and protein identification, an approach we call proteomics informed by transcriptomics (PIT). This approach has gained a lot of interest, particularly among researchers who work with nonmodel organisms or with particularly dynamic proteomes such as those obse...


CauloBrowser: A systems biology resource for Caulobacter crescentus

Caulobacter crescentus is a premier model organism for studying the molecular basis of cellular asymmetry. The Caulobacter community has generated a wealth of high-throughput spatiotemporal databases including data from gene expression profiling experiments (microarrays, RNA-seq, ChIP-seq, ribosome profiling, LC-MS proteomics), gene essentiality studies (Tn-seq), genome-wide protein localizatio...



Journal title:

Volume 29, Issue

Pages -

Publication date: 2013